275 research outputs found

Compositional Embeddings Using Complementary Partitions for Memory-Efficient Recommendation Systems

Author: Mudigere Dheevatsa
Naumov Maxim
Shi Hao-Jun Michael
Yang Jiyan
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 28/06/2020

Modern deep learning-based recommendation systems exploit hundreds to thousands of different categorical features, each with millions of different categories ranging from clicks to posts. To respect the natural diversity within the categorical data, embeddings map each category to a unique dense representation within an embedded space. Since each categorical feature could take on as many as tens of millions of different possible categories, the embedding tables form the primary memory bottleneck during both training and inference. We propose a novel approach for reducing the embedding size in an end-to-end fashion by exploiting complementary partitions of the category set to produce a unique embedding vector for each category without explicit definition. By storing multiple smaller embedding tables based on each complementary partition and combining embeddings from each table, we define a unique embedding for each category at smaller memory cost. This approach may be interpreted as using a specific fixed codebook to ensure uniqueness of each category's representation. Our experimental results demonstrate the effectiveness of our approach over the hashing trick for reducing the size of the embedding tables in terms of model loss and accuracy, while retaining a similar reduction in the number of parameters. Comment: 11 pages, 7 figures, 1 table
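The idea of combining small tables indexed by complementary partitions can be illustrated with a minimal sketch. It assumes one particular pair of partitions (integer quotient and remainder of the category id by a bucket count) and element-wise multiplication as the combining operation; the abstract names neither, and the function names, table sizes, and random initialization below are all hypothetical.

```python
import random


def qr_indices(category, num_buckets):
    """Return the (quotient, remainder) pair for a category id.

    Assumed partitions: ids sharing a quotient form one partition,
    ids sharing a remainder form the other. The pair is unique per id.
    """
    return category // num_buckets, category % num_buckets


def build_tables(num_categories, num_buckets, dim, seed=0):
    """Create two small tables in place of one table with num_categories rows."""
    rng = random.Random(seed)
    num_quotients = -(-num_categories // num_buckets)  # ceiling division
    quotient_table = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                      for _ in range(num_quotients)]
    remainder_table = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                       for _ in range(num_buckets)]
    return quotient_table, remainder_table


def embed(category, quotient_table, remainder_table):
    """Combine one row from each table by element-wise multiplication (assumed)."""
    q, r = qr_indices(category, len(remainder_table))
    return [a * b for a, b in zip(quotient_table[q], remainder_table[r])]


NUM_CATEGORIES = 1000  # hypothetical category count
NUM_BUCKETS = 32       # hypothetical bucket count
DIM = 4                # hypothetical embedding dimension

quotient_table, remainder_table = build_tables(NUM_CATEGORIES, NUM_BUCKETS, DIM)

# Every category maps to a distinct index pair, so no two categories are
# forced to share both rows, unlike a single hashed table.
index_pairs = {qr_indices(c, NUM_BUCKETS) for c in range(NUM_CATEGORIES)}

# Rows actually stored, compared with NUM_CATEGORIES rows for a full table.
stored_rows = len(quotient_table) + len(remainder_table)
```

With these hypothetical sizes, the two tables hold 64 rows in total instead of 1000, while each of the 1000 categories still receives its own index pair.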

arXiv.org e-Print Archive

Crossref

Methods for Quantized Compressed Sensing

Author: Case Mindy
Gu Xiaoyi
Needell Deanna
Shi Hao-Jun Michael
Tu Shenyinying
Publication venue: Scholarship @ Claremont
Publication date: 30/12/2015

In this paper, we compare and catalog the performance of various greedy quantized compressed sensing algorithms that reconstruct sparse signals from quantized compressed measurements. We also introduce two new greedy approaches for reconstruction: Quantized Compressed Sampling Matching Pursuit (QCoSaMP) and Adaptive Outlier Pursuit for Quantized Iterative Hard Thresholding (AOP-QIHT). We compare the performance of greedy quantized compressed sensing algorithms for a given bit-depth, sparsity, and noise level.

arXiv.org e-Print Archive

Scholarship@Claremont

Crossref

Optimizing quantization for Lasso recovery

Author: Case Mindy
Gu Xiaoyi
Needell Deanna
Plan Yaniv
Shi Hao-Jun Michael
Tu Shenyinying
Publication venue: Scholarship @ Claremont
Publication date: 09/06/2016

This letter is focused on quantized Compressed Sensing, assuming that Lasso is used for signal estimation. Leveraging recent work, we provide a framework to optimize the quantization function and show that the recovered signal converges to the actual signal at a quadratic rate as a function of the quantization level. We show that when the number of observations is high, this method of quantization gives a significantly better recovery rate than standard Lloyd-Max quantization. We support our theoretical analysis with numerical simulations.

arXiv.org e-Print Archive

Scholarship@Claremont

A Distributed Data-Parallel PyTorch Implementation of the Distributed Shampoo Optimizer for Training Neural Networks At-Scale

Author: Gallego-Posada Jose
Iwasaki Shintaro
Lee Tsung-Hsien
Li Zhijing
Mudigere Dheevatsa
Rabbat Michael
Rangadurai Kaushik
Shi Hao-Jun Michael
Publication date: 12/09/2023

Shampoo is an online and stochastic optimization algorithm belonging to the AdaGrad family of methods for training neural networks. It constructs a block-diagonal preconditioner where each block consists of a coarse Kronecker product approximation to full-matrix AdaGrad for each parameter of the neural network. In this work, we provide a complete description of the algorithm as well as the performance optimizations that our implementation leverages to train deep networks at-scale in PyTorch. Our implementation enables fast multi-GPU distributed data-parallel training by distributing the memory and computation associated with blocks of each parameter via PyTorch's DTensor data structure and performing an AllGather primitive on the computed search directions at each iteration. This major performance enhancement enables us to achieve at most a 10% performance reduction in per-step wall-clock time compared against standard diagonal-scaling-based adaptive gradient methods. We validate our implementation by performing an ablation study on training ImageNet ResNet50, demonstrating Shampoo's superiority over standard training recipes with minimal hyperparameter tuning. Comment: 38 pages, 8 figures, 5 tables
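A rough sense of why a Kronecker product approximation is attractive comes from counting stored entries. The sketch below assumes the usual layout of one left factor and one right factor per matrix-shaped block, which the abstract does not spell out; the function names and the example shape are hypothetical.

```python
def full_matrix_entries(rows, cols):
    """Entries in a full-matrix AdaGrad preconditioner for a rows x cols parameter."""
    n = rows * cols  # the parameter flattened to a vector of length n
    return n * n


def kronecker_factor_entries(rows, cols):
    """Entries in one rows x rows factor plus one cols x cols factor (assumed layout)."""
    return rows * rows + cols * cols


ROWS, COLS = 1024, 512  # hypothetical block shape

full_count = full_matrix_entries(ROWS, COLS)
factor_count = kronecker_factor_entries(ROWS, COLS)
```

For this hypothetical 1024 x 512 block, the full matrix would need 274,877,906,944 entries, while the two factors need 1,310,720.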

arXiv.org e-Print Archive

ESRRB regulates glucocorticoid gene expression in mice and patients with acute lymphoblastic leukemia

Author: Gallagher Kayleigh M.
Green Michael R.
Kelliher Michelle A.
Li Rui
Murphy Leonard
O'Connor Kevin
Roderick Justine E.
Sanda Takaomi
Tan Shi Hao
Tan Tze King
Yu Jun
Zhu Lihua Julie
Publication venue: eScholarship@UMassChan
Publication date: 13/07/2020

Synthetic glucocorticoids (GCs), such as dexamethasone and prednisone, remain key components of therapy for patients with lymphoid malignancies. For pediatric patients with acute lymphoblastic leukemia (ALL), response to GCs remains the most reliable prognostic indicator; failure to respond to GC correlates with poor event-free survival. To uncover GC resistance mechanisms, we performed a genome-wide, survival-based short hairpin RNA screen and identified the orphan nuclear receptor estrogen-related receptor-beta (ESRRB) as a critical transcription factor that cooperates with the GC receptor (GR) to mediate the GC gene expression signature in mouse and human ALL cells. Esrrb knockdown interfered with the expression of genes that were induced and repressed by GR and resulted in GC resistance in vitro and in vivo. Dexamethasone treatment stimulated ESRRB binding to estrogen-related receptor elements (ERREs) in canonical GC-regulated genes, and H3K27Ac Hi-chromatin immunoprecipitation revealed increased interactions between GR- and ERRE-containing regulatory regions in dexamethasone-treated human T-ALL cells. Furthermore, ESRRB agonists enhanced GC target gene expression and synergized with dexamethasone to induce leukemic cell death, indicating that ESRRB agonists may overcome GC resistance in ALL, and potentially, in other lymphoid malignancies.

eScholarship@UMMS

The New Generation Atlas of Quasar Spectral Energy Distributions from Radio to X-rays

We have produced the next generation of quasar spectral energy distributions (SEDs), essentially updating the work of Elvis et al. (1994) by using high-quality data obtained with several space and ground-based telescopes, including NASA's Great Observatories. We present an atlas of SEDs of 85 optically bright, non-blazar quasars over the electromagnetic spectrum from radio to X-rays. The heterogeneous sample includes 27 radio-quiet and 58 radio-loud quasars. Most objects have quasi-simultaneous ultraviolet-optical spectroscopic data, supplemented with some far-ultraviolet spectra, and more than half also have Spitzer mid-infrared IRS spectra. The X-ray spectral parameters are collected from the literature where available. The radio, far-infrared, and near-infrared photometric data are also obtained from either the literature or new observations. We construct composite spectral energy distributions for radio-loud and radio-quiet objects and compare these to those of Elvis et al., finding that ours have similar overall shapes, but our improved spectral resolution reveals more detailed features, especially in the mid and near-infrared. Comment: 46 pages, 10 figures, 10 tables, Accepted by ApJS. Composite SED data files for radio-loud and radio-quiet quasars (rlmsedMR.txt, rqmsedMR.txt) are included in the source (Other formats -> Source). Supplemental figures are not included

arXiv.org e-Print Archive

Crossref

Texas ScholarWorks

Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species

Author: Élénie Godzaridis
Adam M. Phillippy
Alexey Sergushichev
Anton Alexandrov
Benedict Paten
Binghang Liu
Bruno M. Vieira
Carson Qu
Daniel S. Rokhsar
Dariusz Przybylski
David B. Jaffe
David C. Schwartz
David Haussler
Cristian Del Fabbro
Delphine Naquin
Dent Earl
Dominique Lavenier
Erich D. Jarvis
Fedor Tsarev
Filipe J. Ribeiro
François Laviolette
Francisco Pina Martins
Ganeshkumar Ganapathy
Giles Hall
Guillaume Chapuis
Guojie Zhang
Hamidreza Chitsaz
Hao Zhang
Henry Song
Huaiyang Jiang
Iain Maccallum
Ian F. Korf
Inanç Birol
Isaac Y. Ho
J. Ruby
Jacob O. Kitzman
Jacques Corbeil
James R. Knight
Jared T. Simpson
Jarrod A. Chapman
Jason Howard
Jay Shendure
Jianying Yuan
Joseph B. Hiatt
Joseph N. Fass
Jun Wang
Keith R. Bradnam
Kim C. Worley
Martin Hunt
Matthew D. Macmanes
Matthias Haimel
Michael C. Schatz
Michael Bechner
Michael Place
Nicolas Maillet
Nuno A. Fonseca
Octávio S. Paulo
Paul J. Kersey
Paul Baranay
Pavel Fedotov
Rayan Chikhi
Richard A. Gibbs
Richard Durbin
Ruibang Luo
Sébastien Boisvert
Sante Gnerre
Simone Scalabrin
Scott Emrich
Sergey Kazakov
Sergey Koren
Sergey Melnikov
Shaun D. Jackman
Shiguo Zhou
Shuangye Yin
Siu Ming Yiu
Stephen Richards
Steve Goldstein
T. Docking
Tak Wah Lam
Ted Sharpe
Thomas D. Otto
Timothy I. Shaw
Francesco Vezzi
Riccardo Vicedomini
Wen Chi Chou
Xiang Qin
Yingrui Li
Yue Liu
Yujian Shi
Zemin Ning
Zhenyu Li
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Background: The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions are intended to assess current state-of-the-art methods in genome assembly. Results: In Assemblathon 2, we provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake). This resulted in a total of 43 submitted assemblies from 21 participating teams. We evaluated these assemblies using a combination of optical map data, Fosmid sequences, and several statistical methods. From over 100 different metrics, we chose ten key measures by which to assess the overall quality of the assemblies. Conclusions: Many current genome assemblers produced useful assemblies, containing a significant representation of their genes and overall genome structure. However, the high degree of variability between the entries suggests that there is still much room for improvement in the field of genome assembly and that approaches which work well in assembling the genome of one species may not necessarily work well for another.

Archivio istituzionale della ricerca - Università degli Studi di Udine